Autoregressive clustering for HMM speech synthesis
Abstract
The autoregressive HMM has been shown to provide efficient parameter estimation and high-quality synthesis, but in previous experiments decision trees derived from a non-autoregressive system were used. In this paper we investigate the use of autoregressive clustering for autoregressive HMM-based speech synthesis. We describe decision tree clustering for the autoregressive HMM and highlight differences to the standard clustering procedure. Subjective listening evaluation results suggest that autoregressive clustering improves the naturalness of the resulting speech. We find that the standard minimum description length (MDL) criterion for selecting model complexity is inappropriate for the autoregressive HMM. Investigating the effect of model complexity on naturalness, we find that a large degree of overfitting is tolerated without a substantial decrease in naturalness.
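For orientation, the standard (non-autoregressive) MDL stopping rule mentioned above can be sketched as follows. This is a minimal sketch, assuming one diagonal-covariance Gaussian per tree node and the common formulation in which a split is accepted only when its log-likelihood gain exceeds a penalty proportional to the log of the total state occupancy. The function names, the `weight` parameter, and the toy numbers are illustrative assumptions, not taken from the paper, and the sketch does not implement the autoregressive clustering the paper investigates.

```python
import math


def gaussian_log_likelihood(occupancy, variances):
    """Approximate log likelihood of the data pooled in one node.

    Assumes a single diagonal-covariance Gaussian fitted by maximum
    likelihood: L = -0.5 * occupancy * (D * (1 + log(2*pi)) + sum(log var_d)).
    """
    dim = len(variances)
    log_det = sum(math.log(v) for v in variances)
    return -0.5 * occupancy * (dim * (1.0 + math.log(2.0 * math.pi)) + log_det)


def mdl_split_gain(parent, yes, no, total_occupancy, weight=1.0):
    """Decrease in description length from splitting `parent` into `yes`/`no`.

    Each node is an (occupancy, variances) pair. A positive return value
    means the split is accepted under this (assumed) MDL formulation.
    """
    dim = len(parent[1])
    delta_ll = (
        gaussian_log_likelihood(*yes)
        + gaussian_log_likelihood(*no)
        - gaussian_log_likelihood(*parent)
    )
    # A split adds 2 * dim parameters (mean and variance per dimension),
    # so the penalty is 0.5 * (2 * dim) * log(W) = dim * log(W).
    penalty = weight * dim * math.log(total_occupancy)
    return delta_ll - penalty


# Toy example (hypothetical numbers): a split that sharpens the variance
# is accepted, while a split that changes nothing is rejected.
useful = mdl_split_gain((100.0, [4.0]), (50.0, [1.0]), (50.0, [1.0]), 100.0)
useless = mdl_split_gain((100.0, [4.0]), (50.0, [4.0]), (50.0, [4.0]), 100.0)
print(useful > 0, useless > 0)
```

Raising `weight` above 1 makes the penalty larger and yields smaller trees; lowering it permits larger trees, which is the kind of model-complexity variation whose effect on naturalness the abstract describes.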
Similar resources
A formulation of the autoregressive HMM for speech synthesis
We present a formulation of the autoregressive HMM for speech synthesis and compare it to the standard HMM synthesis framework and the trajectory HMM. We give details of how to do efficient parameter estimation and synthesis with the autoregressive HMM and discuss consequences of the autoregressive HMM model. There are substantial similarities between the three models, which we explore. The adv...
Autoregressive HMMs for speech synthesis
We propose the autoregressive HMM for speech synthesis. We show that the autoregressive HMM supports efficient EM parameter estimation and that we can use established effective synthesis techniques such as synthesis considering global variance with minimal modification. The autoregressive HMM uses the same model for parameter estimation and synthesis in a consistent way, in contrast to the stan...
The Effect of Using Normalized Models in Statistical Speech Synthesis
The standard approach to HMM-based speech synthesis is inconsistent in the enforcement of the deterministic constraints between static and dynamic features. The trajectory HMM and autoregressive HMM have been proposed as normalized models which rectify this inconsistency. This paper investigates the practical effects of using these normalized models, and examines the strengths and weaknesses of...
Minimum generation error criterion for tree-based clustering of context dependent HMMs
Due to the inconsistency between HMM training and synthesis application in HMM-based speech synthesis, the minimum generation error (MGE) criterion has been proposed for HMM training. This paper extends the MGE criterion to tree-based clustering of context-dependent HMMs. As directly applying the MGE criterion results in an unacceptable computational cost, the parameter updating rul...
Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis
In order to express natural prosodic variations in continuous speech, sophisticated speech units such as context-dependent phone models are usually employed in HMM-based speech synthesis techniques. Since the training database cannot practically cover all possible context factors, decision tree-based HMM state clustering is commonly applied. One of the serious problems in a decision tree-bas...